Qwen3: fix quality loss due to rope freq precision #3005

Merged: greenrazer merged 2 commits into huggingface:main from zackangelo:qwen3_rope_fix on Jun 26, 2025

@zackangelo (Contributor)

When running Qwen3 in bf16, I noticed a substantial loss in output quality along with bugs (skipped tokens), especially when dealing with multi-piece tokens and multi-byte characters. Fixed by doing all intermediate RoPE frequency and sin/cos computation in F32, then casting down to the requested dtype.
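For illustration, here is a minimal std-only Rust sketch of why the order of operations matters. It is not the code from this PR and does not use candle: `to_bf16`, `inv_freqs`, `rope_f32_then_cast`, and `rope_all_bf16` are hypothetical helpers, bf16 is emulated by rounding an f32 to 8 significant bits, and the `head_dim = 128` and `theta = 1_000_000.0` values suggested in the usage note are assumptions rather than values taken from this thread.

```rust
/// Round an f32 to the nearest bf16-representable value (round-to-nearest-even)
/// and return it as an f32. NaN/overflow handling is omitted for brevity.
fn to_bf16(x: f32) -> f32 {
    let bits = x.to_bits();
    // Add half of the discarded range, plus the lowest kept bit to break ties to even.
    let bias = 0x7FFF + ((bits >> 16) & 1);
    f32::from_bits(bits.wrapping_add(bias) & 0xFFFF_0000)
}

/// Inverse frequencies theta^(-i/head_dim) for i = 0, 2, 4, ..., computed in f32.
fn inv_freqs(head_dim: usize, theta: f32) -> Vec<f32> {
    (0..head_dim)
        .step_by(2)
        .map(|i| 1.0 / theta.powf(i as f32 / head_dim as f32))
        .collect()
}

/// Approach described above: keep the angle in f32 and only cast the
/// final sin/cos values down to bf16.
fn rope_f32_then_cast(pos: usize, inv_freq: &[f32]) -> Vec<(f32, f32)> {
    inv_freq
        .iter()
        .map(|&f| {
            let angle = pos as f32 * f;
            (to_bf16(angle.sin()), to_bf16(angle.cos()))
        })
        .collect()
}

/// Problematic variant: position, frequency, and their product are all
/// rounded to bf16 before sin/cos are taken.
fn rope_all_bf16(pos: usize, inv_freq: &[f32]) -> Vec<(f32, f32)> {
    inv_freq
        .iter()
        .map(|&f| {
            let angle = to_bf16(to_bf16(pos as f32) * to_bf16(f));
            (to_bf16(angle.sin()), to_bf16(angle.cos()))
        })
        .collect()
}
```

With `let f = inv_freqs(128, 1_000_000.0);`, the all-bf16 path cannot tell positions 1000 and 1001 apart, because bf16 only represents multiples of 4 in the range 512 to 1024, so both positions get identical sin/cos rows. The f32-then-cast path keeps them distinct. Collapsed positions of this kind are one plausible source of the skipped-token behavior described above.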

Also changed the candle Qwen example to run in bf16 by default on Metal.

@greenrazer merged commit ab14581 into huggingface:main on Jun 26, 2025
3 of 18 checks passed
@greenrazer (Contributor)

Thanks!

john-sharratt pushed a commit to john-sharratt/candle that referenced this pull request May 7, 2026
* qwen3 bugfix: compute rope freqs in f32

* qwen example: run model in bf16 on metal